Using Asymmetric Distributions to Improve Classifier Probabilities: A Comparison of New and Standard Parametric Methods
نویسنده
چکیده
For many discriminative classifiers, it is desirable to convert an unnormalized confidence score output from the classifier to a normalized probability estimate. Such a method can also be used for creating better estimates from a probabilistic classifier that outputs poor estimates. Typical parametric methods have an underlying assumption that the score distribution for a class is symmetric; we motivate why this assumption is undesirable, especially when the scores are output by a classifier. Two asymmetric families, an asymmetric generalization of a Gaussian and a Laplace distribution, are presented, and a method of fitting them in expected linear time is described. Finally, an experimental analysis of parametric fits to the outputs of two text classifiers, naı̈ve Bayes (which is known to emit poor probabilities) and a linear SVM, is conducted. The analysis shows that one of these asymmetric families is theoretically attractive (introducing few new parameters while increasing flexibility), computationally efficient, and empirically preferable. Email: [email protected]
منابع مشابه
A parametric model for predicting cut point of hydraulic classifiers
A new parametric model was developed for predicting cut point of hydraulic classifiers. The model directly uses operating parameters including pulp flowrate, feed particle size characteristics, pulp solids content, solid density and particles retention time in the classification chamber and also covers uncontrollable errors using calibration constants. The model applicability was first verified...
متن کاملComparison of Parametric and Non-parametric EEG Feature Extraction Methods in Detection of Pediatric Migraine without Aura
Background: Migraine headache without aura is the most common type of migraine especially among pediatric patients. It has always been a great challenge of migraine diagnosis using quantitative electroencephalography measurements through feature classification. It has been proven that different feature extraction and classification methods vary in terms of performance regarding detection and di...
متن کاملHyperbolic Cosine Log-Logistic Distribution and Estimation of Its Parameters by Using Maximum Likelihood Bayesian and Bootstrap Methods
In this paper, a new probability distribution, based on the family of hyperbolic cosine distributions is proposed and its various statistical and reliability characteristics are investigated. The new category of HCF distributions is obtained by combining a baseline F distribution with the hyperbolic cosine function. Based on the base log-logistics distribution, we introduce a new di...
متن کاملA comparison of parametric and non-parametric methods of standardized precipitation index (SPI) in drought monitoring (Case study: Gorganroud basin)
The Standardized Precipitation Index (SPI) is the most common index for drought monitoring. Although the calculation of this index is usually done by using the gamma distribution fitting of precipitation data, studies have shown that for accurate monitoring of drought, the optimal distribution of precipitation in each month should be determined. On the other hand, in non-stationary time series,...
متن کاملتحلیل ممیز غیرپارامتریک بهبودیافته برای دستهبندی تصاویر ابرطیفی با نمونه آموزشی محدود
Feature extraction performs an important role in improving hyperspectral image classification. Compared with parametric methods, nonparametric feature extraction methods have better performance when classes have no normal distribution. Besides, these methods can extract more features than what parametric feature extraction methods do. Nonparametric feature extraction methods use nonparametric s...
متن کامل